NOTES ON THE SIGNIFICANCE AND USE OF THE
HILLEGAS SCALE FOR MEASURING THE QUALITY
OF ENGLISH COMPOSITION



EDWARD L. THORNDIKE 
Teachers College, Columbia University 



It is obvious that specimen 294¹ has more merit as English
writing than specimen 519. It is not obvious that specimen 294 
has more merit than specimen 225. But the same argument by 
which we justify the assertion that No. 294 has much more merit
than No. 519 justifies the assertion that it has somewhat more 
merit than No. 225. The argument is simply an appeal to experts. 
Out of one hundred and sixty members of the English Section of 
the 1912 Conference at the University of Illinois, 70 per cent judged
that 294 had more merit than 225. Now, as has been explained by 
Dr. Hillegas and the author, it is possible to transmute the measures 
of difference in merit contained in the percentages of judgments 
of superiority into units of amount of difference, so that, for 
example, we know that the difference between No. 294 and No. 519 
is about three and a half times the difference between No. 294 and 
No. 225. Calling 1.00 that amount of difference so great (or, 
better, so small) that only seventy-five out of a hundred such 
experts rank the two samples correctly, twenty-five putting the 
worse sample ahead of the better, and deciding on what kind of 
writing has just zero or just not any merit, we can find samples that 
are each just 1.00 better than zero; others that are each just 1.00
better than these or 2.00 better than zero; others that are each just
1.00 better than these or 3.00 better than zero; and so on. If
sample 580 is taken as zero, samples 94, 519, 534, 196, and 221 are
approximately 3.7, 4.7, 5.7, 6.7, and 7.7. Now samples 94 to
221, or amounts of merit 4.7 to 7.7, give roughly the quality of 
work which our high-school pupils display in examination papers,
in set themes and the like. A few high-school compositions will
run from 8.0 up to 9.0 and a few from 4.0 down to 3.0.

¹ The various specimens referred to appear below. See pp. 557-561.
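The transmutation of percentages into units of difference can be sketched in a few lines. The sketch below is only an illustration under a stated assumption: that the conversion follows the normal curve, with 1.00 fixed as the difference which 75 per cent of judges rank correctly. The function name `percent_to_units` is hypothetical and does not come from the paper.

```python
from statistics import NormalDist

# Assumption: judgments are converted on the normal curve, and the unit
# (1.00) is the difference that 75 per cent of the judges rank correctly.
_STD_NORMAL = NormalDist()
_Z_AT_75 = _STD_NORMAL.inv_cdf(0.75)  # normal deviate at 75 per cent


def percent_to_units(percent_better: float) -> float:
    """Convert a percentage of 'better' judgments into scale units."""
    if not 0.0 < percent_better < 100.0:
        raise ValueError("percentage must lie strictly between 0 and 100")
    return _STD_NORMAL.inv_cdf(percent_better / 100.0) / _Z_AT_75


if __name__ == "__main__":
    # 70 per cent is the share quoted above for specimen 294 over 225.
    for pct in (50, 65, 70, 75):
        print(f"{pct} per cent -> {percent_to_units(pct):.2f} units")
```

Under this assumption an even split of opinion corresponds to no difference at all, and the unit grows quickly as the judges approach unanimity.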

It is the purpose of this paper to measure roughly the difference 
between the paragraph-writing of boys and girls in high school and 
that selected as the specially good performances of recognized 
masters of English prose. 

The facts are, very simply, as follows: specimen 258 is by 
Washington Irving; specimen 217 is by Hawthorne; specimen 296 
is by Thackeray. These were chosen for the author by a teacher 
of English as such samples of the better work of these authors as 
made convenient units for isolated estimation. Specimen 231 is by 
a college Freshman; specimen 294 is by a high-school pupil; 
specimens 221, 225, 196, 245, 329, 338, and 519 are high-school 
specimens, covering the range from 8.0 to a little below 5.0. 

Now specimen 519 has the least merit of any of these. Call its
merit, for the present, x. Then, by the judgments of 160 members
of the English Section of the Illinois Conference of 1912, the
specimens are ranked in the following order and have the following
amounts of merit.

(H.S.) 519 is of merit x.
(H.S.) 338 is judged better than 519 by 56¼ per cent of the judges, and is of
    merit x+0.23 (1.00 being defined above).
(H.S.) 329 is judged better than 338 by 76.6 per cent of the judges, and is of
    merit x+0.23+1.08, or x+1.31.
(H.S.) 196 is judged better than 329 by 66¼ per cent of the judges, and is of
    merit x+1.31+0.62, or x+1.93.
(Thackeray) 296 is judged better than 196 by 65 per cent of the judges, and is
    of merit x+1.93+0.57, or x+2.5.
(H.S. and H.S.) 221 and 225 are judged one a trifle worse and one a trifle better
    than 296 (48.1 per cent and 53.1 per cent), averaging practically the
    same merit of x+2.5.
(Coll. Fresh.) 231 is judged better than 296 (Thackeray) by 70 per cent of the
    judges, and is of merit x+2.5+0.78, or x+3.28.
(H.S.) 294 is judged better than 231 by 55⅝ per cent of the judges, and is of
    merit x+3.28+0.21, or x+3.49.
(Hawthorne) 217 is judged better than 294 by 63¾ per cent of the judges, and
    is of merit x+3.49+0.52, or x+4.01.
(Irving) 258 is judged better than 217 by 55⅝ per cent of the judges, and is of
    merit x+4.01+0.21, or x+4.22.

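The running totals in the list above are plain cumulative sums of the steps between neighbouring specimens. The sketch below simply re-adds the steps as stated in the text; the names `STEPS` and `merits_above_x` are hypothetical, and the steps are held as whole hundredths only to keep the sums exact.

```python
from itertools import accumulate

# Steps between neighbouring specimens, in hundredths of a unit, as stated
# in the list above. Integers avoid floating-point drift in the sums.
STEPS = [
    ("338", 23),
    ("329", 108),
    ("196", 62),
    ("296 (Thackeray)", 57),
    ("231 (Coll. Fresh.)", 78),
    ("294", 21),
    ("217 (Hawthorne)", 52),
    ("258 (Irving)", 21),
]


def merits_above_x(steps):
    """Return each specimen's merit above x (the merit of specimen 519)."""
    totals = accumulate(step for _, step in steps)
    return {name: total / 100 for (name, _), total in zip(steps, totals)}


MERITS = merits_above_x(STEPS)

if __name__ == "__main__":
    for name, merit in MERITS.items():
        print(f"{name}: x+{merit:.2f}")
```

Specimens 221 and 225 are left out of the chain because the text places them at practically the same merit as 296.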
Now, assuming the freedom of these 160 judges from any
unfairness in favor of the high-school specimens or against the 
standard writers (whatever prejudice there was worked probably 
in the other way), it is clear that the difference between the two 
groups is, in certain important senses, very small. Specimen 519, 
which has little more merit than the worst of high-school com- 
positions, is only one and a half times as far below the Thackeray 
passage as that is below the Irving passage. It would in fact be 
very easy to find many paragraphs in the "standard" essayists, 
historians, and novelists that would have been credited with less 
than x+2.25 merit by this group of teachers of English. A very
fair percentage of high-school compositions would be credited above
x+2.25. The paragraph-writing of pupils in our high schools and
that of the world's hundred best English writers undoubtedly 
overlap considerably in merit. Assume for the sake of illustration 
that the average writing of a fourth-year class is of merit x+2.0
and that the average writing of the hundred masters is of merit
x+3.75. Then if a number of samples of the former were paired
with a number of samples of the latter and judged by this group of 
teachers, the superiority of the latter would be far from obvious. 
In fact, in the long run, twelve out of a hundred of the teachers of 
English would rank the high-school specimen above the master's. 
Now x+2 is certainly not much too high for fourth-year work in a
good school and x+3.75 is certainly not much too low for the 
average paragraph of the masters of English prose. 
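The figure of twelve judges in a hundred can be checked with a short sketch. It assumes, as an illustration only, that a difference of d units is reversed by a share of judges given by the upper tail of the normal curve, with the unit fixed at the 75 per cent point; the name `share_reversing` is hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()
Z_AT_75 = STD_NORMAL.inv_cdf(0.75)  # normal deviate for one unit (1.00)


def share_reversing(difference_in_units: float) -> float:
    """Share of judges expected to rank the worse sample above the better.

    Assumes normally distributed judgments, with 1.00 defined as the
    difference that 75 per cent of judges rank correctly.
    """
    return 1.0 - STD_NORMAL.cdf(difference_in_units * Z_AT_75)


if __name__ == "__main__":
    gap = 3.75 - 2.0  # masters' average minus fourth-year average
    print(f"gap of {gap:.2f} units: "
          f"{100 * share_reversing(gap):.1f} per cent reverse the order")
```

By construction a gap of exactly 1.00 is reversed by one judge in four, which is the definition of the unit.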

The fact may be stated more simply if the difference between x 
and zero — that is, the absolute merit of specimens such as No. 
519 — is determined. If these one hundred and sixty judges had 
ranked also in the same way specimens running from No. 519 
down to some as bad as No. 580, 580 would have been found to be 
about equal to x-4.1. That is, No. 580 would have been put not
quite as far below No. 519 as No. 258 was put above it. If the 
one hundred and sixty judges had each made up artificially a 
paragraph that represented his notion of zero, or just not any, 
merit— the merit of a paragraph in which merit in English com- 
position is just barely beginning to be observable — these zero 
specimens would on the average have been not much, if any, worse 
than specimen 580. We are fairly safe in saying that x is at least
3.5 and not over 4.5 above absolute zero in the opinion of these
one hundred and sixty judges. Call x equal to 4.0. Then the
absolute values of the specimens are: 

519, representing nearly the lower limit of high-school paragraph-writing, is 4.

338 is 4.23. 329 is 5.31. 245 is 5.79. 277 is 5.93.

296 (Thackeray) is 6.5.

221 (H.S.) and 225 (H.S.) are about 6.5. 

231 (Coll. Fresh.) is 7.28. 294 (H.S.) is 7.49. 

217 (Hawthorne) is 8.01. 258 (Irving) is 8.22. 

Consequently, in the judgment of high-school teachers of 
English, the worst tenth of paragraph-writing of high-school pupils 
is still nearly half-way from zero toward the best the world knows. 
What we rightly consider a mediocre composition, such as 245, still 
represents nearly three-fourths of the progress possible. 
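The two fractions just quoted follow from simple division, if one assumes for illustration that the Irving passage (8.22) stands for "the best the world knows". The names below are hypothetical.

```python
# Absolute merits taken from the list above (x set equal to 4.0).
ABSOLUTE_MERIT = {"519": 4.0, "245": 5.79, "258 (Irving)": 8.22}

# Assumption: the Irving passage is taken as the upper end of the scale.
BEST = ABSOLUTE_MERIT["258 (Irving)"]


def share_of_progress(merit: float, best: float = BEST) -> float:
    """Fraction of the distance from zero merit to the best specimen."""
    return merit / best


if __name__ == "__main__":
    for name in ("519", "245"):
        share = share_of_progress(ABSOLUTE_MERIT[name])
        print(f"specimen {name}: {share:.2f} of the way from zero")
```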

I will not continue with similar comparisons nor draw any of the 
many, and, as I think, important practical conclusions which these 
measurements warrant, but will only restate the measurements 
themselves in a metaphor. 

If ten men, A, B, C, ... J, ran one at a time a hundred
yards in 10, 10½, 11, ... 14½ seconds, respectively, and if, after
each successive pair had run, we had to judge (without watches or 
counting) which ran faster, the judgment of a hundred and sixty 
observers, even if well trained, would often err. The quicker 
runner of the two would get only a plurality unless the difference 
in time was great. We should, applying just the same treatment 
to the judgments that was applied to the compositions, come out 
with a difference between A and J of perhaps only 4.0, and find 
that the fastest running in the world was not, after all, remote from 
an average high-school boy's performance. If we decided that 
merit in running began at a rate of five feet or less a second we 
should find that our average school boy had already made three 
quarters or more of the progress possible. Teachers of athletics 
would disagree very widely in the "marks" that they gave to the 
same feat of running; and we should quarrel bitterly over the 
respective merits of A and B! We should reflect, perhaps with 
surprise, that the world pays enormous sums in money and fame for 
a difference so small that one person out of four cannot see it! 
It may be well to meet one criticism which will be thought of by
many of the hundred and sixty teachers who gave the judgments 
— the criticism that these judgments (made in forty minutes) were 
hasty and necessarily inaccurate. This was the fact, but its only 
effect on the issue was to make all the differences represented by 
1.00 larger than they would have been with more time and care.
All the relations shown by careful judgments would be the same as
those stated here, but the values themselves would range more
widely. The difference between 580 and 519 would be perhaps
5.00 instead of 4.1 and the difference between 519 and 258 would
be perhaps 5.2 instead of 4.2. All the differences would be swollen
in the same proportions. The difference noted by seventy-five out
of a hundred judges working rapidly and called 1.00 here, would,
with more care, be noted by say eighty out of a hundred and so be
called 1.25 in a study made with very careful judgments. The
unit 1.00 means an amount of difference observable by three-fourths
of certain specified judges under certain specified conditions.
Improve the fineness of discrimination of the judges or the
conditions under which they judge, and 1.00 means a smaller difference.
Samples 94, 519, 534, 196, and 221 were given values of 3.7, 4.7,
5.7, 6.7, and 7.7 by such expert and careful judgments. By the
one hundred and sixty judgments considered here sample 519 is
put only 2.5 below sample 221.
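The rescaling described in this paragraph can be illustrated as follows. The sketch assumes the normal-curve conversion and supposes, as the text does, that a difference seen by 75 per cent of hasty judges would be seen by 80 per cent of careful ones; the name `unit_ratio` is hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def unit_ratio(pct_careful: float, pct_hasty: float = 75.0) -> float:
    """Factor by which every difference swells under careful judgment.

    Assumes the same real difference is seen by `pct_hasty` per cent of
    hasty judges and `pct_careful` per cent of careful ones.
    """
    careful = STD_NORMAL.inv_cdf(pct_careful / 100.0)
    hasty = STD_NORMAL.inv_cdf(pct_hasty / 100.0)
    return careful / hasty


if __name__ == "__main__":
    factor = unit_ratio(80.0)
    print(f"swelling factor: {factor:.2f}")
    # The two spans quoted in the text, stretched by the same factor.
    for label, span in (("580 to 519", 4.1), ("519 to 258", 4.2)):
        print(f"{label}: {span} becomes about {span * factor:.1f}")
```

Because every difference is multiplied by the same factor, the order of the specimens and the ratios between their differences are left unchanged.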

The second purpose of this paper is to measure the amount of 
error to be expected in grading specimens of English writing by a 
scale such as that furnished by samples 94, 519, 534, 196, and 221 
of values 3.7, 4.7, 5.7, 6.7, and 7.7, respectively. It is obvious 
that, since 1.00 represents a difference which one out of four
careful judges will fail to see, the average error in giving any
specimen a number in comparison with the scale must be large. 

For example, specimen 231 was ranked as:

Worse than 338 by 3⅛ per cent of the 160 judges
Worse than 329 but better than 338 by 6¼ per cent of the judges
Worse than 277 but better than 329 by 15 per cent of the judges
Worse than 296 but better than 277 by 5⅝ per cent of the judges
Worse than 294 but better than 296 by 25⅝ per cent of the judges
Better than 294 but worse than 217 by 14⅜ per cent of the judges
Better than 217 but worse than 258 by 3⅛ per cent of the judges
Better than 258 by 26⅞ per cent of the judges



556 THE ENGLISH JOURNAL 

That is, a specimen which the general opinion of the one hundred 
and sixty ranks as 7.27 is ranked below 4.2 by some; and, since it
is ranked above 8.22 by over a fourth of the judges, would probably
be ranked as nearly 10.0 by others.

If a new sample (call it N) is really of merit 5.7, for example, 
even careful judges will tend to regard it as worse than 4.7 in one 
case out of four, and as better than 6.7 in one case out of four (the
same judge, of course, cannot make both of these errors). So 
(except for the influence of the scale as a whole) one-fourth of the 
grades assigned to N will be below 4.7 and one-fourth above 6.7;
the median error will be 1.00 and the average error about 1.18. 
The effect of the scale as a whole is complex, and I will not figure out 
the probabilities for it. A judge comparing our supposed N of real 
value 5.7, might, for example, regard it as worse than both 519 
(4.7) and 196 (6.7), rating it 4.3, if these two and it were the only 
means of estimate, but might regard it as equal to 534 (5.7) if N and
534 were the only means of comparison. If the whole scale is given 
and if he is converted to the belief that 534 is half-way between 519 
and 196, and recognizes also that N is very much better than 94 
(3.7) and not very much worse than 196 (6.7), then he may judge
N to be 5.0 or 5.5, improving his judgment markedly. 
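The median error of 1.00 and the average error of about 1.18 can be checked with a short sketch, assuming that the grades assigned to N are normally distributed about its real value of 5.7, with a quarter of them falling more than one unit below it. The variable names are hypothetical.

```python
import math
from statistics import NormalDist

# Assumption: grades given to N are normally distributed about its real
# merit, with one unit (1.00) marking the 75 per cent point of the curve.
REAL_MERIT = 5.7
SIGMA = 1.0 / NormalDist().inv_cdf(0.75)  # standard deviation in scale units
GRADES = NormalDist(mu=REAL_MERIT, sigma=SIGMA)

share_below_4_7 = GRADES.cdf(4.7)
share_above_6_7 = 1.0 - GRADES.cdf(6.7)

# Half of all errors are smaller than the median error, so it is the
# distance from the real merit to the 75 per cent point of the grades.
median_error = GRADES.inv_cdf(0.75) - REAL_MERIT

# Mean absolute deviation of a normal distribution: sigma * sqrt(2 / pi).
average_error = SIGMA * math.sqrt(2.0 / math.pi)

if __name__ == "__main__":
    print(f"below 4.7: {share_below_4_7:.2f}, above 6.7: {share_above_6_7:.2f}")
    print(f"median error: {median_error:.2f}, average error: {average_error:.2f}")
```

The sketch ignores the steadying influence of the scale as a whole, which the text sets aside as too complex to figure out.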

Professor Hillegas is measuring the errors made in using such a 
scale; and Mr. Johnston reported at the Illinois Conference a most 
interesting series of such measurements.¹ Three facts will be
proved as such studies progress. First, the errors will be large; 
second, they will diminish with practice in using such a scale and 
with improvements in the scale itself; third, they will — at least, 
after sufficient practice — be smaller than the errors now made 
by teachers in grading paragraph-writing for general merit. The 
reason for the last fact is that at present a teacher, in grading a 
composition for general merit, uses a subjective, personal scale of 
values which, in the nature of the case, cannot, on the average, be 
as correct as one due to the combined opinions of a hundred or more 
judges who are on the average as competent as he is. The teacher 
now adds the errors of his personal subjective scale of values to the 
errors of comparing a specimen therewith. A scale such as has been
referred to here, and such as Professor Hillegas has worked out,
eliminates the former errors altogether and, if the teacher has had
enough practice with it, cannot increase, and probably will decrease,
the errors of comparison.

¹ See School Review for January, 1913.

SPECIMENS OF ENGLISH WRITING REFERRED TO IN THE TEXT 

94. 

Sulla as a Tyrant 
When Sulla came back from his conquest Marius had put himself consul so 
sulla with the army he had with him in his conquest siezed the government from 
Marius and put himself in consul and had a list of his enemys printy and the 
men whoes names were on this list we beheaded. 

196. 

Ichabod Crane 
Ichabod Crane was a schoolmaster in a place called Sleepy Hollow. He 
was tall and slim with broad shoulders, long arms that dangled far below his 
coat sleeves. His feet looked as if they might easily have been used for shovels. 
His nose was long and his entire frame was most loosely hung to-gether. 

217 

Selection From Hawthorne 
Oh that I had never heard of Niagara till I beheld it! Blessed were the 
wanderers of old, who heard its deep roar, sounding through the woods, as the 
summons to an unknown wonder, and approached its awful brink, in all the 
freshness of native feeling. Had its own mysterious voice been the first to 
warn me of its existence, then, indeed, I might have knelt down and wor- 
shipped. But I had come hither, haunted with a vision of foam and fury, and 
dizzy cliffs, and an ocean tumbling down out of the sky — a scene, in short, 
which nature had too much good taste and calm simplicity to realize. My 
mind had struggled to adapt these false conceptions to the reality, and finding 
the effort vain, a wretched sense of disappointment weighed me down. I 
climbed the precipice, and threw myself on the earth feeling that I was 
unworthy to look at the Great Falls, and careless about beholding them again. 

221 

Going Down with Victory 
As we road down Lombard Street, we saw flags waving from nearly every 
window. I surely felt proud that day to be the driver of the gaily decorated 
coach. Again and again we were cheered as we drove slowly to the post- 
masters, to await the coming of his majestie's mail. There wasn't one of the 
gaily bedecked coaches that could have compared with ours, in my estimation. 
So with waving flags and fluttering hearts we waited for the coming of the mail 
and the expected tidings of victory. 
When at last it did arrive the postmaster began to quickly sort the bundles,
we waited anxiously. Immediately upon receiving our bundles, I lashed the 
horses and they responded with a jump. Out into the country we drove at 
reckless speed — everywhere spreading like wildfire the news, "Victory!" 
The exileration that we all felt was shared with the horses. Up and down 
grade and over bridges, we drove at breakneck speed and spreading the news at 
every hamlet with that one cry "Victory!" When at last we were back home 
again, it was with the hope that we should have another ride some day with 
"Victory." 

225 

Before the Renaissance, artists and sculptors made their statues and 
pictures thin, and weak looking figures. They saw absolutely no beauty in 
the human body. At the time of the Renaissance, artists began to see beauty 
in musclar and strong bodies, and consequently many took warriors as subjects 
for their statues. Two of the statues that Michel Angelo, the great sculptor 
and artist, made, Perseus with the head of Medusa, and David with Goliath's 
head, are very similar. They show minutely and with wonderful exactness 
every muscle of the body. Michel Angelo was a great student of the body, 
especially when it was in a strained position. The position of the figures on the 
tomb of Lorenzo the Great is so wonderful that one can almost see the tension 
of the muscles. 

231 

A Foreigner's Tribute to Joan of Arc 
Joan of Arc, worn out by the suffering that was thrust upon her, netherthe- 
less appeared with a brave mien before the Bishop of Beauvais. She knew, 
had always known that she must die when her mission was fulfilled and death 
held no terrors for her. To all the bishop's questions she answered firmly and 
without hesitation. The bishop failed to confuse her and at last condemned 
her to death for heresy, bidding her recant if she would live. She refused and 
was lead to prison, from there to death. 

While the flames were writhing around her she bade the old bishop who 
stood by her to move away or he would be injured. Her last thought was of 
others and De Quincy says, that recant was no more in her mind than on her 
lips. She died as she lived, with a prayer on her lips and listening to the 
voices that had whispered to her so often. 

The heroism of Joan of Arc was wonderful. We do not know what form 
her great patriotism took or how far it really led her. She spoke of hearing 
voices and of seeing visions. We only know that she resolved to save her 
country, knowing though she did so, it would cost her her life. Yet she never 
hesitated. She was uneducated save for the lessons taught her by nature. 
Yet she led armies and crowned the dauphin, king of France. She was only a 
girl, yet she could silence a great bishop by words that came from her heart 
and from her faith. She was only a woman, yet she could die as bravely as 
any martyr who had gone before. 

245

I am going to Princeton partly because it was my father's college. I also 
prefer to go to a college away from home. You get the college life much more 
that way. My main reason is on account of the great advantages held forth 
in the preceptorial system. The preceptorial system is organized as follows. 
Imagine a class, junior for example of perhaps three hundred, divided into 
sections of twenty five each. For each of these sections there are six preceptors, 
men engaged to head groups of four or five to talk over their work with them 
and give them points and suggestions about it. The advantage of this is that 
the man gets a great deal more individual attention in this manner than he 
otherwise would. Princeton has high standards of intellectuality as well 
as athletics. 

258 
Selections From Irving 

In the meantime, the seasons gradually rolled on. The little frogs which 
had piped in the meadows in early spring, croaked as bull-frogs during the 
summer heats, and then sank into silence. The peach-tree budded, blossomed, 
and bore its fruit. The swallows and martins came, twittered about the roof, 
built their nests, reared their young, held their congress along the eaves, and 
then winged their flight in search of another spring. The caterpillar spun its 
winding-sheet, dangled in it from the great button-wood tree before the house; 
turned into a moth, fluttered with the last sunshine of summer, and disap- 
peared; and finally the leaves of the button-wood tree turned yellow, then 
brown, then rustled one by one to the ground, and, whirling about in little 
eddies of wind and dust, whispered that winter was at hand. 

294 

Among the beautiful islands on the Canadian side of the St. Lawrence 
River, there is a deep and narrow channel which separates three small wooded 
islands from a large fertile one. Of the three islands the largest is rocky and 
covered with a growth of stately pines and waving hemlocks, and a carpet of 
moss and ferns. On the second there is quite an assortment of trees, whose 
foliage during the fall turns to many shades of gold and red, which colors are 
greatly enhanced by the dark green background of its neighbor. On the third 
there is a thick growth of brush, with an occasional small tree. These three 
islands are so close together, that fallen trees and logs make it possible to walk 
from one to another. 

296 
Selections From Thackeray 

How one loves to see the burly figure of him, this thickskinned, seemingly 
opaque, perhaps sulky, almost stupid man of practice, pitted against some light 
adroit man of theory, all equipt with clear logic, and able anywhere to give 
you why for wherefore! The adroit man of theory, so light of movement, clear
of utterance, with his bow full-bent and quiver full of arrow-arguments — surely
he will strike down the game, transfix everywhere the heart of the matter; 
triumph everywhere, as he proves that he shall and must do? To your
astonishment, it turns out oftenest no. The cloudy-browed, thick-soled, opaque
practicality, with no logic-utterance, in silence mainly, with here and there a 
low grunt or growl, has in him what transcends all logic-utterances; a congruity 
with the unuttered. The speakable, which lies atop, as a superficial film, or 
outer skin, is his or is not his: but the doable, which reaches down to the 
world's center, you find him there! 

329 

When Abraham was twenty one the family moved to Decatur where he 
made his first public speech. Here he built a boat and went to New Orleans 
where for the first time he saw slaves. Then he vowed he put and end to it 
someday if he could. When he returned he went to New Salem where he was 
postmaster and store clerk He was then elected to the legislature. He studied 
law and when twenty eight he was admitted to the bar. Then after a few years 
he was elected president and office which he filled as few men would at his time. 
When he was elected his troubles began. He was against slavery the states 
left the union. At the war which freed the slaves came. During this war 
Lincoln showed his kind heartness by pardoning so many men. He did not 
like to see these men shot leaving their wives and families fatherless. 

338 

This man who is the chief character of this story, is the stingest man in 
town one day before Christmas and the nicest man on Christmas, and all this 
comes from a dream. His name is Soloman and in his dream he dreams of 
coming home to his old cheap looking home, in an old side alley and, as he gets 
to the door this gosts head appears and as he open the door it departs, lighting 
a match to go up stairs with, not fearing the gost, and then starts up stairs and 
he had no sooner reached the top step when there was and awful clammer of 
chains and bells, As he walks into his room he hears the sound coming up the 
stairs nearer and nearer to his room every minute, And after he got in bed 
and blew out the light, he heard the gost walk right in his room and call him so 
he got up, being scared and afraid the gost would harm him, the gost told him 
to sit down beside him which the did, And then he said that he was Soloman 
partner and had died twenty years ago. 

519 

De Quincy 

First: De Quincys mother was a beautiful women and through her De 
Quincy inhereted much of his genius. 

His running away from school enfluenced him much as he roamed through 
the woods, valleys and his mind became very meditative. 
The greatest enfluence of De Quincy's life was the opium habit. If it was
not for this habit it is doubtful whether we would now be reading his writings. 

His companions during his college course and even before that time were 
great enfluences. The surroundings of De Quincy were enfluences. Not only
De Quincy's habit of opium but other habits which were peculiar to his life. 

His marriage to the woman which he did not especially care for. 

The many well educated and noteworthy friends of De Quincy. 

534 

Fluellen 
The passages given show the following characteristic of Fluellen: his 
inclination to brag, his professed knowledge of History, his complaining 
character, his great patriotism, pride of his leader, admired honesty, revengeful, 
love of fun and punishment of those who deserve it. 

580 

Letter 
Dear Sir: I write to say that it aint a square deal Schools is I say they 
is I went to a school, red and gree green and brown aint it hito bit I say he 
don't know his business not today nor yeaterday and you know it and I want 
Jennie to get me out. 



